NPLDA: A Deep Neural PLDA Model for Speaker Verification
The state-of-the-art approach for speaker verification consists of a neural
network based embedding extractor along with a backend generative model such as
the Probabilistic Linear Discriminant Analysis (PLDA). In this work, we propose
a neural network approach for backend modeling in speaker recognition. The
likelihood ratio score of the generative PLDA model is posed as a
discriminative similarity function and the learnable parameters of the score
function are optimized using a verification cost. The proposed model, termed as
neural PLDA (NPLDA), is initialized using the generative PLDA model parameters.
The loss function for the NPLDA model is an approximation of the minimum
detection cost function (DCF). The speaker recognition experiments using the
NPLDA model are performed on the speaker verification task in the VOiCES
datasets as well as the SITW challenge dataset. In these experiments, the NPLDA
model optimized using the proposed loss function improves significantly over
the state-of-the-art PLDA based speaker verification system.
Comment: Published in Odyssey 2020, the Speaker and Language Recognition
Workshop (VOiCES Special Session). Link to GitHub Implementation:
https://github.com/iiscleap/NeuralPlda. arXiv admin note: substantial text
overlap with arXiv:2001.0703
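As a rough illustration of the idea in this abstract (not the authors' implementation), the PLDA log-likelihood ratio has a quadratic form in the two embeddings, and the NPLDA approach treats its matrices as learnable parameters trained with a differentiable approximation of the detection cost. The function names, the NumPy setup, and the sigmoid sharpness `alpha` below are all illustrative assumptions:

```python
import numpy as np

def nplda_score(x1, x2, P, Q):
    """Quadratic similarity with the functional form of the PLDA
    log-likelihood ratio; P (cross term) and Q (self term) are the
    learnable parameters in the discriminative setting."""
    return x1 @ Q @ x1 + x2 @ Q @ x2 + 2.0 * x1 @ P @ x2

def soft_detection_cost(scores, labels, theta, alpha=10.0,
                        c_miss=1.0, c_fa=1.0, p_target=0.05):
    """Differentiable surrogate of the detection cost function (DCF):
    hard miss / false-alarm counts at threshold theta are replaced by
    sigmoid approximations so the cost can be back-propagated."""
    sig = 1.0 / (1.0 + np.exp(-alpha * (scores - theta)))
    p_miss = np.mean((1.0 - sig)[labels == 1])  # soft miss rate on targets
    p_fa = np.mean(sig[labels == 0])            # soft false-alarm rate on non-targets
    return c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
```

With symmetric P the score is invariant to swapping enrollment and test embeddings, which is the behaviour one would expect of a verification similarity.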
Neural PLDA Modeling for End-to-End Speaker Verification
While deep learning models have made significant advances in supervised
classification problems, the application of these models for out-of-set
verification tasks like speaker recognition has been limited to deriving
feature embeddings. The state-of-the-art x-vector PLDA based speaker
verification systems use a generative model based on probabilistic linear
discriminant analysis (PLDA) for computing the verification score. Recently, we
proposed a neural network approach for backend modeling in speaker
verification called the neural PLDA (NPLDA) where the likelihood ratio score of
the generative PLDA model is posed as a discriminative similarity function and
the learnable parameters of the score function are optimized using a
verification cost. In this paper, we extend this work to achieve joint
optimization of the embedding neural network (x-vector network) with the NPLDA
network in an end-to-end (E2E) fashion. This proposed end-to-end model is
optimized directly from the acoustic features with a verification cost function
and during testing, the model directly outputs the likelihood ratio score. With
various experiments using the NIST speaker recognition evaluation (SRE) 2018
and 2019 datasets, we show that the proposed E2E model improves significantly
over the x-vector PLDA baseline speaker verification system.
Comment: Accepted in Interspeech 2020. GitHub Implementation Repos:
https://github.com/iiscleap/E2E-NPLDA and
https://github.com/iiscleap/NeuralPld
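The end-to-end composition described above can be sketched as a single pipeline from frame-level features to a likelihood-ratio score, so that a verification cost can be back-propagated through both the embedding and the scoring parameters. This toy stand-in (simple mean pooling plus one affine layer, written in NumPy) is an illustrative assumption, not the paper's x-vector architecture:

```python
import numpy as np

def xvector_embed(feats, W, b):
    """Toy stand-in for the x-vector network: temporal mean pooling of
    frame-level features followed by an affine layer with ReLU.
    feats: (T, F) array of acoustic features."""
    pooled = feats.mean(axis=0)              # statistics (mean) pooling over time
    return np.maximum(W @ pooled + b, 0.0)   # utterance-level embedding

def e2e_llr(feats1, feats2, W, b, P, Q):
    """End-to-end scoring: acoustic features in, likelihood-ratio score
    out, via a shared embedding network and an NPLDA-style quadratic
    scoring layer (matrices P, Q)."""
    e1 = xvector_embed(feats1, W, b)
    e2 = xvector_embed(feats2, W, b)
    return e1 @ Q @ e1 + e2 @ Q @ e2 + 2.0 * e1 @ P @ e2
```

In the actual system both stages would be differentiable modules trained jointly under the verification cost; the sketch only shows the data flow.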
Towards Few-shot Entity Recognition in Document Images: A Graph Neural Network Approach Robust to Image Manipulation
Recent advances in incorporating layout information, typically bounding box
coordinates, into pre-trained language models have yielded significant
performance gains in entity recognition from document images. Coordinates
readily model the absolute position of each token, but they can be sensitive
to manipulations of document images (e.g., shifting, rotation or scaling),
especially when the training data is limited in few-shot settings. In this
paper, we propose to further introduce the topological adjacency relationship
among the tokens, emphasizing their relative position information.
Specifically, we consider the tokens in the documents as nodes and formulate
the edges based on the topological heuristics from the k-nearest bounding
boxes. Such adjacency graphs are invariant to affine transformations including
shifting, rotations and scaling. We incorporate these graphs into the
pre-trained language model by adding graph neural network layers on top of the
language model embeddings, leading to a novel model LAGER. Extensive
experiments on two benchmark datasets show that LAGER significantly outperforms
strong baselines under different few-shot settings and also demonstrates better
robustness to manipulations.
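The graph construction described in this abstract can be sketched as follows: each token's bounding-box center becomes a node, and edges connect each node to its k nearest neighbours. Such a graph is unchanged by translation, rotation, and uniform scaling, the manipulations the abstract names. The function name and NumPy formulation are illustrative assumptions, not the LAGER code:

```python
import numpy as np

def knn_adjacency(centers, k):
    """Build a k-nearest-neighbour adjacency graph over token bounding-box
    centers. centers: (N, 2) array of (x, y) box centers; returns an
    (N, N) boolean matrix with adj[i, j] = True if j is a kNN of i."""
    n = len(centers)
    # pairwise Euclidean distances between box centers
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # a token is not its own neighbour
    adj = np.zeros((n, n), dtype=bool)
    for i in range(n):
        adj[i, np.argsort(d[i])[:k]] = True
    return adj
```

Because neighbour ranking depends only on relative distances, shifting, rotating, or uniformly rescaling the whole page leaves the adjacency matrix identical.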
Activate or inhibit? Implications of autophagy modulation as a therapeutic strategy for Alzheimer’s disease
Neurodegenerative diseases result in a range of conditions depending on the type of proteinopathy, the genes affected or the location of the degeneration in the brain. Proteinopathies such as senile plaques and neurofibrillary tangles in the brain are prominent features of Alzheimer’s disease (AD). Autophagy is a highly regulated mechanism of eliminating dysfunctional organelles and proteins, and plays an important role in removing these pathogenic intracellular protein aggregates, not only in AD, but also in other neurodegenerative diseases. Activating autophagy is gaining interest as a potential therapeutic strategy for chronic diseases featuring protein aggregation and misfolding, including AD. Although autophagy activation is a promising intervention, over-activation of autophagy in neurodegenerative diseases that display impaired lysosomal clearance may accelerate pathology, suggesting that the success of any autophagy-based intervention depends on lysosomal clearance being functional. Additionally, the effects of autophagy activation may vary significantly depending on the physiological state of the cell, especially during proteotoxic stress and ageing. Growing evidence seems to favour a strategy of enhancing the efficacy of autophagy by preventing or reversing the impairments of the specific processes that are disrupted. Therefore, it is essential to understand the underlying causes of the autophagy defect in different neurodegenerative diseases to explore possible therapeutic approaches. This review will focus on the role of autophagy during stress and ageing, the consequences linked to its activation, and the caveats in modulating this pathway as a treatment.
Hyperon bulk viscosity and r-modes of neutron stars
We propose and apply a new parameterization of the modified chiral effective
model to study rotating neutron stars with hyperon cores in the framework of
the relativistic mean-field theory. The inclusion of mesonic cross couplings in
the model improves the density dependence of the symmetry energy and its slope
parameter, in agreement with the findings from recent terrestrial
experiments. The bulk viscosity of the hyperonic medium is analyzed to
investigate its role in the suppression of gravitationally driven r-modes.
The hyperonic bulk viscosity coefficient caused by non-leptonic weak
interactions and the corresponding damping timescales are calculated and the
r-mode instability windows are obtained. The present model predicts a
significant reduction of the unstable region due to a more effective damping of
oscillations. We find that, over an intermediate range of temperatures, the
hyperonic bulk viscosity completely suppresses the r-modes, leading to a stable region
between the instability windows. Our analysis indicates that the instability
can reduce the angular velocity of the star down to about 0.3 Ω_K, where
Ω_K is the Kepler frequency of the star.
Comment: 9 pages, 9 figures; Accepted for publication in MNRA
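The competition the abstract describes can be summarized by the standard r-mode timescale balance from the r-mode literature (a sketch of the generic criterion, not the paper's own equations): the net growth rate of the mode is

    1/τ(Ω, T) = −1/τ_GW(Ω) + 1/τ_BV(Ω, T) + 1/τ_SV(Ω, T),

where τ_GW is the gravitational-wave growth timescale and τ_BV, τ_SV are the bulk- and shear-viscosity damping timescales. The mode is unstable where 1/τ < 0, i.e. where gravitational-wave driving overcomes viscous damping; the instability window is the region of the (T, Ω) plane where this holds, bounded above by the Kepler frequency Ω_K. A strong hyperonic bulk viscosity enlarges 1/τ_BV at intermediate temperatures, which is how it can carve a stable region between two instability windows.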